
Data Introduction

Chicago has a bike share program known as Divvy. The data for bike rides in 2013 and 2014 was recently made available to the public in two main datasets: a station information dataset and a trips information dataset. I elected to work with the Q1Q2 2014 datasets because the program was relatively new in 2013 (and had some odd qualities). I analyzed rides that took place in May and June, which are relatively nice weather months in Chicago.

Metadata for Trips table:

Variables:

trip_id: ID attached to each trip taken
starttime: day and time trip started, in CST
stoptime: day and time trip ended, in CST
bikeid: ID attached to each bike
tripduration: time of trip in seconds
from_station_name: name of station where trip originated
to_station_name: name of station where trip terminated
from_station_id: ID of station where trip originated
to_station_id: ID of station where trip terminated
usertype: “Customer” is a rider who purchased a 24-Hour Pass; “Subscriber” is a rider who purchased an Annual Membership
gender: gender of rider
birthyear: birth year of rider

Notes:

* First row contains column names
* Total records = 905,699 (the May and June 2014 subset contains 624,752 trips)
* Trips that did not include a start or end date were removed from the original table
* Gender and birth year are only available for Subscribers

##   to_station_id from_station_id trip_id        stoptime bikeid
## 1             5             143 1786322 5/24/2014 13:10    386
## 2             5              36 1601544   5/8/2014 9:00    982
## 3             5             191 1936405   6/3/2014 7:49   1524
## 4             5              98 1667049 5/13/2014 19:12   2878
## 5             5              75 2235716 6/23/2014 15:51   1295
## 6             5              77 2307022  6/28/2014 1:36   1774
##   tripduration          from_station_name        to_station_name
## 1         3048  Sedgwick St & Webster Ave State St & Harrison St
## 2          314 Franklin St & Jackson Blvd State St & Harrison St
## 3          492       Canal St & Monroe St State St & Harrison St
## 4          396 LaSalle St & Washington St State St & Harrison St
## 5          295    Canal St & Jackson Blvd State St & Harrison St
## 6          536    Clinton St & Madison St State St & Harrison St
##     usertype gender birthyear           starttime user_age triptimemin
## 1   Customer               NA 2014-05-24 12:20:00       NA          51
## 2 Subscriber   Male      1954 2014-05-08 08:54:00       59           5
## 3 Subscriber   Male      1976 2014-06-03 07:40:00       37           8
## 4 Subscriber   Male      1980 2014-05-13 19:05:00       33           7
## 5 Subscriber   Male      1984 2014-06-23 15:46:00       29           5
## 6 Subscriber   Male      1992 2014-06-28 01:27:00       21           9
##        day                     name.x latitude.x longitude.x dpcapacity.x
## 1 Saturday  Sedgwick St & Webster Ave   41.92197   -87.63854           15
## 2 Thursday Franklin St & Jackson Blvd   41.87771   -87.63532           31
## 3  Tuesday       Canal St & Monroe St   41.88070   -87.63947           23
## 4  Tuesday LaSalle St & Washington St   41.88266   -87.63253           15
## 5   Monday    Canal St & Jackson Blvd   41.87799   -87.64077           23
## 6 Saturday    Clinton St & Madison St   41.88158   -87.64128           23
##   onlinedate.x                 name.y latitude.y longitude.y dpcapacity.y
## 1   2013-08-03 State St & Harrison St   41.87396   -87.62774           19
## 2   2013-06-15 State St & Harrison St   41.87396   -87.62774           19
## 3   2013-08-07 State St & Harrison St   41.87396   -87.62774           19
## 4   2013-07-15 State St & Harrison St   41.87396   -87.62774           19
## 5   2013-06-15 State St & Harrison St   41.87396   -87.62774           19
## 6   2013-06-15 State St & Harrison St   41.87396   -87.62774           19
##   onlinedate.y  startdate weekend
## 1   2013-06-18 2014-06-30 Weekend
## 2   2013-06-18 2014-06-30    Week
## 3   2013-06-18 2014-06-30    Week
## 4   2013-06-18 2014-06-30    Week
## 5   2013-06-18 2014-06-30    Week
## 6   2013-06-18 2014-06-30 Weekend
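
The columns triptimemin, day, and weekend in the output above are not part of the raw trips metadata, so they appear to be derived fields. The following is a minimal Python sketch of how such fields could be computed for one trip. The function name `derive_trip_fields`, the raw timestamp format (taken from the stoptime column), and the round-to-nearest-minute rule are assumptions inferred from the printed rows, not the code actually used for this analysis.

```python
from datetime import datetime

# weekday() returns 0 for Monday through 6 for Sunday
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def derive_trip_fields(starttime, tripduration):
    """Return (triptimemin, day, weekend) for a single trip.

    starttime: raw timestamp such as "5/24/2014 12:20" (assumed format)
    tripduration: trip length in seconds
    """
    started = datetime.strptime(starttime, "%m/%d/%Y %H:%M")
    triptimemin = round(tripduration / 60)  # seconds to nearest minute (assumed)
    day = DAY_NAMES[started.weekday()]
    weekend = "Weekend" if started.weekday() >= 5 else "Week"
    return triptimemin, day, weekend


# First row of the trips output: 3048 seconds, started 5/24/2014 12:20
print(derive_trip_fields("5/24/2014 12:20", 3048))
```

For the first row shown above, this yields 51 minutes on a Saturday, flagged as a weekend ride, which agrees with the printed values.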

Metadata for Stations table:

Variables:

name: station name
latitude: station latitude
longitude: station longitude
dpcapacity: number of total docks at each station as of 8/20/2014
onlinedate: date the station went live in the system

##    id                         name latitude longitude dpcapacity
## 1  43 Michigan Ave & Washington St 41.88389 -87.62465         43
## 2  44       State St & Randolph St 41.88473 -87.62773         27
## 3  33      State St & Van Buren St 41.87718 -87.62784         27
## 4 199       Wabash Ave & Grand Ave 41.89174 -87.62694         15
## 5  51       Clark St & Randolph St 41.88458 -87.63189         31
## 6  98   LaSalle St & Washington St 41.88266 -87.63253         15
##   onlinedate
## 1 2013-06-16
## 2 2013-06-16
## 3 2013-06-25
## 4 2013-08-10
## 5 2013-06-17
## 6 2013-07-15
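
The trips output earlier carries station columns with .x and .y suffixes (name.x, dpcapacity.y, and so on), which suggests the stations table was joined to the trips table twice: once on from_station_id and once on to_station_id. Below is a small Python sketch of that idea using plain dictionaries. The helper `attach_stations` is hypothetical, and the sample values are copied from the rows printed in this document.

```python
# Station lookup keyed by station id (values taken from the printed output)
stations = {
    143: {"name": "Sedgwick St & Webster Ave", "dpcapacity": 15},
    98: {"name": "LaSalle St & Washington St", "dpcapacity": 15},
    5: {"name": "State St & Harrison St", "dpcapacity": 19},
}

trips = [
    {"trip_id": 1786322, "from_station_id": 143, "to_station_id": 5},
    {"trip_id": 1667049, "from_station_id": 98, "to_station_id": 5},
]


def attach_stations(trip, stations):
    """Copy origin station fields with a .x suffix and destination fields with .y."""
    joined = dict(trip)
    for suffix, key in ((".x", "from_station_id"), (".y", "to_station_id")):
        for field, value in stations[trip[key]].items():
            joined[field + suffix] = value
    return joined


joined_trips = [attach_stations(trip, stations) for trip in trips]
print(joined_trips[0]["name.x"], "->", joined_trips[0]["name.y"])
```

Keeping the origin and destination attributes side by side on each trip row is what makes the later per-station and per-location summaries straightforward.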

I also thought that it would be useful to add historical local weather information from wunderground.com to help explain any low-ridership days caused by inclement weather.

##        CDT Events       date
## 1 2014-5-1   Rain 2014-05-01
## 2 2014-5-2        2014-05-02
## 3 2014-5-3        2014-05-03
## 4 2014-5-4        2014-05-04
## 5 2014-5-5   Rain 2014-05-05
## 6 2014-5-6        2014-05-06
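
In the weather output above, the CDT column holds unpadded date strings (2014-5-1) while the date column holds the normalized form (2014-05-01). A rough Python sketch of that normalization, plus a lookup of days with a recorded weather event, might look like this. The function name `parse_weather` and the choice to treat any non-empty Events value as a wet day are assumptions for illustration.

```python
from datetime import datetime

# (CDT, Events) pairs copied from the printed weather rows
weather_rows = [
    ("2014-5-1", "Rain"),
    ("2014-5-2", ""),
    ("2014-5-5", "Rain"),
    ("2014-5-6", ""),
]


def parse_weather(rows):
    """Map each normalized ISO date (YYYY-MM-DD) to its Events string."""
    events_by_date = {}
    for raw_date, events in rows:
        date = datetime.strptime(raw_date, "%Y-%m-%d").date()
        events_by_date[date.isoformat()] = events
    return events_by_date


events_by_date = parse_weather(weather_rows)

# Assumption: a non-empty Events value marks a day with rain or storms
wet_days = sorted(d for d, events in events_by_date.items() if events)
print(wet_days)
```

With the dates normalized, each trip's start date can be matched against `events_by_date` to label rides that took place on wet days.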

Data Analysis

I initially thought it might be useful to look at the number of rides per day by user type. The following graph also encodes whether the day was a weekend day and whether the day had rain or thunderstorms. In general, subscribers seem to use the service much more frequently than customers (people without subscriptions); when customers ride more than subscribers on a given day, it tends to be a Friday or a weekend day. There also seems to be an uptick in rides in June, which I believe is probably related to the summer holidays. Subscribers also seem to be less negatively affected by rain and thunderstorms than customers.
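
The daily counts behind a graph like this come down to grouping trips by start date and user type. Here is a minimal Python sketch of that tally. The function name `rides_per_day` is hypothetical, and the handful of sample rows are taken from the trips output shown earlier, so the counts are purely illustrative and not the results of the analysis.

```python
from collections import Counter


def rides_per_day(trips):
    """Count rides for each (date, usertype) pair.

    trips: iterable of (starttime, usertype) tuples, where starttime is
    formatted as "YYYY-MM-DD HH:MM:SS"
    """
    return Counter((starttime[:10], usertype) for starttime, usertype in trips)


sample = [
    ("2014-05-24 12:20:00", "Customer"),
    ("2014-05-08 08:54:00", "Subscriber"),
    ("2014-06-03 07:40:00", "Subscriber"),
    ("2014-06-28 01:27:00", "Subscriber"),
]

counts = rides_per_day(sample)
for (date, usertype), n in sorted(counts.items()):
    print(date, usertype, n)
```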

I also thought it might be useful to look at the total number of rides by day of the week and gender (information only available for subscribers). As you can see, males seem to make up a much larger portion of the service's subscribers than females. Male riders also seem to be slightly older on average, and their age distribution is more right-skewed than that of females. Perhaps Divvy should run a promotion to offer females a discounted first-year membership.

Additionally, I thought it might be interesting to look at the average number of rides per station by geographical location. As you can see, subscribers tend to start their trips at stations slightly farther from the city center than nonsubscribers do; this is probably because nonsubscribers may be tourists renting bikes for a day of sightseeing.

Bike pickups at stations do not appear to change much between weekdays and weekends for either user type; the two groups appear to have different patterns overall, but neither group's pattern changes much over the weekend.

Subscribers seem to have the same riding habits on weekdays and weekends in terms of ride time. However, nonsubscribing customers tend to have much longer average ride times on weekends.
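
A comparison like this rests on the mean ride time within each combination of user type and weekend flag. The sketch below shows one way to compute those group means in Python. The function name `mean_ride_minutes` is hypothetical, and the six sample rows are the ones printed in the trips output above, so the resulting numbers only illustrate the calculation.

```python
from collections import defaultdict
from statistics import mean


def mean_ride_minutes(trips):
    """Average triptimemin for each (usertype, weekend) group."""
    groups = defaultdict(list)
    for usertype, weekend, minutes in trips:
        groups[(usertype, weekend)].append(minutes)
    return {key: mean(values) for key, values in groups.items()}


# (usertype, weekend, triptimemin) from the six printed trip rows
sample = [
    ("Customer", "Weekend", 51),
    ("Subscriber", "Week", 5),
    ("Subscriber", "Week", 8),
    ("Subscriber", "Week", 7),
    ("Subscriber", "Week", 5),
    ("Subscriber", "Weekend", 9),
]

averages = mean_ride_minutes(sample)
for key in sorted(averages):
    print(key, averages[key])
```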